Information Regularization with Partially Labeled Data
Authors
Abstract
Classification with partially labeled data requires using a large number of unlabeled examples (or an estimated marginal P(x)) to further constrain the conditional P(y|x) beyond a few available labeled examples. We formulate a regularization approach to linking the marginal and the conditional in a general way. The regularization penalty measures the information that is implied about the labels over covering regions. No parametric assumptions are required, and the approach remains tractable even for continuous marginal densities P(x). We develop algorithms for solving the regularization problem for finite covers, establish a limiting differential equation, and exemplify the behavior of the new regularization approach in simple cases.
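As a rough illustration of a penalty that "measures the information that is implied about the labels over covering regions", the sketch below computes the mutual information between point identity and a binary label inside each region of a finite cover, then sums these values weighted by region size. It is only a minimal sketch under stated assumptions, not the authors' algorithm: the abstract does not give the formula, so the binary labels, the uniform weighting of points within a region, the use of a finite point set in place of P(x), and the function names `region_information` and `information_penalty` are all hypothetical choices made for this example.

```python
import math


def region_information(cond_probs):
    """Mutual information I(x; y) in nats within one region.

    Assumption: binary labels and equally weighted points.
    cond_probs holds P(y=1 | x_i) for each point x_i in the region.
    """
    n = len(cond_probs)
    avg = sum(cond_probs) / n  # region-level P(y=1)

    def kl(p, q):
        # KL divergence between Bernoulli(p) and Bernoulli(q), with 0*log(0) = 0
        total = 0.0
        for a, b in ((p, q), (1.0 - p, 1.0 - q)):
            if a > 0.0:
                total += a * math.log(a / b)
        return total

    # I(x; y) equals the average KL divergence from each conditional to the region mean
    return sum(kl(p, avg) for p in cond_probs) / n


def information_penalty(cond_probs, regions):
    """Sum of per-region information, weighted by each region's share of points.

    Assumption: regions is a list of index lists forming a (possibly overlapping) cover.
    """
    n = len(cond_probs)
    return sum(
        (len(region) / n) * region_information([cond_probs[i] for i in region])
        for region in regions
    )


# Hypothetical usage: four points on a line, three overlapping regions
probs = [0.0, 0.0, 1.0, 1.0]
cover = [[0, 1], [1, 2], [2, 3]]
penalty = information_penalty(probs, cover)
```

In this toy setup only the middle region, where the label changes, contributes to the penalty, so conditionals that vary within a region are penalized while conditionals that are constant within every region incur no cost.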
Similar resources
Partially labeled classification with Markov random walks
To classify a large number of unlabeled examples we combine a limited number of labeled examples with a Markov random walk representation over the unlabeled examples. The random walk representation exploits any low dimensional structure in the data in a robust, probabilistic manner. We develop and compare several estimation criteria/algorithms suited to this representation. This includes in par...
The determination of pair distance distributions by pulsed ESR using Tikhonov regularization.
Pulsed ESR techniques with the aid of site-directed spin labeling have proven useful in providing unique structural information about proteins. The determination of distance distributions in electron spin pairs directly from the dipolar time evolution of the pulsed ESR signals by means of the Tikhonov regularization method is reported. The difficulties connected with numerically inverting this ...
Semi-supervised Feature Selection via Spectral Analysis
Feature selection is an important task in effective data mining. A new challenge to feature selection is the so-called "small labeled-sample problem", in which the labeled data set is small and the unlabeled data set is large. The paucity of labeled instances provides insufficient information about the structure of the target concept, and can cause supervised feature selection algorithms to fail. Unsupervised fe...
Regularization and Semi-supervised Learning on Large Graphs
We consider the problem of labeling a partially labeled graph. This setting may arise in a number of situations from survey sampling to information retrieval to pattern recognition in manifold settings. It is also of potential practical importance, when the data is abundant, but labeling is expensive or requires human assistance. Our approach develops a framework for regularization on such grap...
Automatic estimation of regularization parameter by active constraint balancing method for 3D inversion of gravity data
Gravity data inversion is one of the important steps in the interpretation of practical gravity data. The inversion result can be obtained by minimizing the Tikhonov objective function. Determining an optimal regularization parameter is highly important in gravity data inversion. In this work, an attempt was made to use the active constraint balancing (ACB) method to select the...